Off-Topic Detection in Automated Speech Assessment Applications
Authors
Abstract
Automated L2 speech assessment applications need some mechanism for validating the relevance of user responses before providing scores. In this paper, we discuss a method for off-topic detection in an automated speech assessment application: a high-stakes English test (PTE Academic). Unlike traditional topic detection techniques that use characteristics of text alone, our method focused mainly on features derived from speech confidence scores. We also enhanced our off-topic detection model by incorporating other features derived from acoustic likelihood, language model likelihood, and garbage modeling. The final combination model significantly outperformed classification from any individual feature. When fixing the false rejection rate at 5% in our test set, we achieved a false acceptance rate of 9.8%, a very promising result.
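The operating point quoted in the abstract (false acceptance rate at a fixed false rejection rate) can be illustrated with a minimal sketch. It assumes each response has a single relevance score where higher means more likely on-topic, and that a response is rejected when its score falls below a threshold. The function name, the score values, and the thresholding rule are hypothetical illustrations, not the procedure used in the paper.

```python
import math


def far_at_fixed_frr(on_topic_scores, off_topic_scores, target_frr):
    """Pick a threshold so that at most `target_frr` of on-topic responses
    are rejected, then measure how many off-topic responses are accepted.

    Assumption (not from the paper): a response is rejected when its
    score is strictly below the threshold.
    """
    if not on_topic_scores or not off_topic_scores:
        raise ValueError("both score lists must be non-empty")

    ranked = sorted(on_topic_scores)
    n_on = len(ranked)

    # Number of on-topic responses we are allowed to reject.
    allowed = math.floor(target_frr * n_on + 1e-9)
    allowed = min(allowed, n_on - 1)

    # Everything strictly below this score is rejected.
    threshold = ranked[allowed]

    false_rejects = sum(1 for s in on_topic_scores if s < threshold)
    false_accepts = sum(1 for s in off_topic_scores if s >= threshold)

    frr = false_rejects / n_on
    far = false_accepts / len(off_topic_scores)
    return threshold, frr, far


# Toy scores, invented purely for illustration.
on_topic = [0.2, 0.6, 0.7, 0.8, 0.9]
off_topic = [0.1, 0.3, 0.65, 0.4]
threshold, frr, far = far_at_fixed_frr(on_topic, off_topic, target_frr=0.2)
print(f"threshold={threshold}, FRR={frr:.2f}, FAR={far:.2f}")
```

With tied scores the realized false rejection rate can fall below the target, because all responses sharing the threshold score are accepted together.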
Similar resources
Off-Topic Spoken Response Detection with Word Embeddings
In this study, we developed an automated off-topic response detection system as a supplementary module for an automated proficiency scoring system for non-native English speakers’ spontaneous speech. Given a spoken response, the system first generates an automated transcription using an ASR system trained on non-native speech, and then generates a set of features to assess similarity to the que...
Off-topic Response Detection for Spontaneous Spoken English Assessment
Automatic spoken language assessment systems are becoming increasingly important to meet the demand for English second language learning. This is a challenging task due to the high error rates of, even state-of-the-art, non-native speech recognition. Consequently current systems primarily assess fluency and pronunciation. However, content assessment is essential for full automation. As a first ...
Off-Topic Spoken Response Detection Using Siamese Convolutional Neural Networks
In this study, we developed an off-topic response detection system to be used in the context of the automated scoring of nonnative English speakers’ spontaneous speech. Based on transcriptions generated from an ASR system trained on non-native speakers’ speech and various semantic similarity features, the system classified each test response as an on-topic or off-topic response. The recent succ...
Syllable and language model based features for detecting non-scorable tests in spoken language proficiency assessment applications
This work introduces new methods for detecting non-scorable tests, i.e., tests that cannot be accurately scored automatically, in educational applications of spoken language proficiency assessment. Those include cases of unreliable automatic speech recognition (ASR), often because of noisy, off-topic, foreign or unintelligible speech. We examine features that estimate signal-derived syllable inf...
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
In order to facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
Journal title:
Volume, issue:
Pages: -
Publication date: 2011